Testing and validating machine learning classifiers by metamorphic testing
نویسندگان
چکیده
Machine Learning algorithms have provided core functionality to many application domains - such as bioinformatics, computational linguistics, etc. However, it is difficult to detect faults in such applications because often there is no "test oracle" to verify the correctness of the computed outputs. To help address the software quality, in this paper we present a technique for testing the implementations of machine learning classification algorithms which support such applications. Our approach is based on the technique "metamorphic testing", which has been shown to be effective to alleviate the oracle problem. Also presented include a case study on a real-world machine learning application framework, and a discussion of how programmers implementing machine learning algorithms can avoid the common pitfalls discovered in our study. We also conduct mutation analysis and cross-validation, which reveal that our method has high effectiveness in killing mutants, and that observing expected cross-validation result alone is not sufficiently effective to detect faults in a supervised classification program. The effectiveness of metamorphic testing is further confirmed by the detection of real faults in a popular open-source classification program.
منابع مشابه
A Machine Learning Approach for Developing Test Oracles for Testing Scientific Software
Absence of test oracles is the grand challenge for testing complex scientific software. Metamorphic testing is the novel technique for developing test oracles on metamorphic relations. Although it is easy to find metamorphic relations based on general guidelines and domain knowledge, the ones that can adequately test the software are difficult to be developed. This paper introduces a machine le...
متن کاملUsing Semi-Supervised Learning for Predicting Metamorphic Relations
Software testing is difficult to automate, especially in programs which have no oracle, or method of determining which output is correct. Metamorphic testing is a solution this problem. Metamorphic testing uses metamorphic relations to define test cases and expected outputs. A large amount of time is needed for a domain expert to determine which metamorphic relations can be used to test a given...
متن کاملProperties of Machine Learning Applications for Use in Metamorphic Testing
It is challenging to test machine learning (ML) applications, which are intended to learn properties of data sets where the correct answers are not already known. In the absence of a test oracle, one approach to testing these applications is to use metamorphic testing, in which properties of the application are exploited to define transformation functions on the input, such that the new output ...
متن کاملA Machine Learning Based Framework for Verification and Validation of Massive Scale Image Data
Big data validation and system verification are crucial for ensuring the quality of big data applications. However, a rigorous technique for such tasks is yet to emerge. During the past decade, we have developed a big data system called CMA for investigating the classification of biological cells based on cell morphology that is captured in diffraction images. CMA includes a group of scientific...
متن کاملPredicting metamorphic relations for testing scientific software: a machine learning approach using graph kernels
Comprehensive, automated software testing requires an oracle to check whether the output produced by a test case matches the expected behavior of the program. But the challenges in creating suitable oracles limit the ability to perform automated testing in some programs, and especially in scientific software. Metamorphic testing is a method for automating the testing process for programs withou...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید
ثبت ناماگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید
ورودعنوان ژورنال:
- The Journal of systems and software
دوره 84 4 شماره
صفحات -
تاریخ انتشار 2011